Locality-Aware Many-Core Garbage Collection
Abstract
The wide-scale deployment of multi-core and many-core processors will necessitate fundamental changes to garbage collectors. Highly parallel garbage collection is critical to the performance of these systems: today's garbage collectors can quickly become the bottleneck for parallel programs. These processors also present new challenges: many have non-uniform memory architectures in which some cores have faster access to certain regions of memory than to others. This paper presents a new cache-aware approach to garbage collection. Our collector balances the competing concerns of data locality and heap utilization to improve performance. We have implemented our garbage collector and present results on a 64-core TILEPro64 processor. Our cache-aware parallel collector speeds up garbage collection by up to 46.7×.
Similar resources
Locality-Aware GC Optimisations for Big Data Workloads
Many Big Data analytics and IoT scenarios rely on fast, non-relational storage (NoSQL) to help process massive amounts of data. In addition, managed runtimes (e.g. JVM) are now widely used to support the execution of these NoSQL storage solutions, particularly when dealing with Big Data key-value store-driven applications. The benefits of such runtimes can however be limited by automatic ...
Incorporating Locality Management into Garbage Collection in Massively Parallel Object-Oriented Languages
This paper discusses how locality between objects affects the performance, and proposes a software architecture for enhancing locality while keeping load-balance reasonable at the minimum sacrifice of runtime overhead. Objects are created locally by default and long-lived objects are selectively migrated during garbage collection. By enhancing locality, message passings are likely to be local and...
Topology-Aware Parallelism for NUMA Copying Collectors
NUMA-aware parallel algorithms in runtime systems attempt to improve locality by allocating memory from local NUMA nodes. Researchers have suggested that the garbage collector should profile memory access patterns or use object locality heuristics to determine the target NUMA node before moving an object. However, these solutions are costly when applied to every live object in the reference gra...
ORDER: Object centRic DEterministic Replay for Java
Deterministic replay systems, which record and replay non-deterministic events during program execution, have many applications such as bug diagnosis, intrusion analysis and fault tolerance. It is well understood how to replay native (e.g., C) programs on multi-processors, while there is little work for concurrent Java applications on multicore. State-of-the-art work for Java either assumes dat...
Uniprocessor Garbage Collection Techniques
We survey basic garbage collection algorithms and variations such as incremental and generational collection. The basic algorithms include reference counting, mark-sweep, mark-compact, copying, and treadmill collection. Incremental techniques can keep garbage collection pause times short by interleaving small amounts of collection work with program execution. Generational schemes improve efficiency an...